Orthogroup expression analysis

Instead of relying on one-to-one orthologues, we incorporate in-paralogues into comparative transcriptomics analysis. As I have discussed in the article vignette of myTAI (see here), this approach has some limitations, in that while not explicitely transferring (potential) issues from the orthologue conjecture, some elements of this assumption is incoporated. Regardless, I think it still captures more information than using one-to-one orthologues.

This is different to the original filter because with the original filter, we wanted to retain as much genes as possible for TAI analysis. Here, we want to minimise species difference.

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggfortify)
sample_info_fd <-
  readr::read_csv("data/sample_info_fd.csv")
## Rows: 34 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (9): SampleName, SeqProt, Species, stage_tissue, Stage, Tissue, Experime...
## dbl (1): Replicate
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
sample_info_fs <-
  readr::read_csv("data/sample_info_fs.csv")
## Rows: 48 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (9): SampleName, SeqProt, Species, stage_tissue, Stage, Tissue, Experime...
## dbl (1): Replicate
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
sample_info_ec <-
  readr::read_csv("data/sample_info_ec.csv")
## Rows: 57 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (9): SampleName, SeqProt, Species, stage_tissue, Stage, Tissue, Experime...
## dbl (1): Replicate
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
sample_info_fd_embryo <-
  sample_info_fd %>%
  dplyr::filter(Stage %in% c("E1", "E2", "E3", "E4", "E5", "E6")) %>%
  dplyr::mutate(LibLab = str_c(Species, "_", Stage, "_", Replicate))
sample_info_fs_embryo <-
  sample_info_fs %>%
  dplyr::filter(Stage %in% c("24H", "48H", "1w", "3w", "4w")) %>%
  dplyr::mutate(Replicate = case_when(
    LibName == "Fs_X_zygotes_48h_1" ~ 4,
    LibName == "Fs_X_zygotes_48h_2" ~ 5,
    LibName == "Fs_X_zygotes_48h_3" ~ 6,
    TRUE ~ Replicate
  )) %>%
  mutate(LibLab = str_c(Species, "_", Stage, "_", Replicate))
Fs_abundance <-
  readr::read_csv(file = "data/Fs_OG_abundance.csv")
## Rows: 5596 Columns: 49
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): OG
## dbl (48): Fs_F_gametes_1, Fs_F_gametes_2, Fs_F_gametes_3, Fs_M_gametes_1, Fs...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Fs_counts <-
  readr::read_csv(file = "data/Fs_OG_counts.csv")
## Rows: 5596 Columns: 49
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): OG
## dbl (48): Fs_F_gametes_1, Fs_F_gametes_2, Fs_F_gametes_3, Fs_M_gametes_1, Fs...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Fd_abundance <-
  readr::read_csv(file = "data/Fd_OG_abundance.csv")
## Rows: 5410 Columns: 35
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): OG
## dbl (34): Fd_f_gametes_1, Fd_f_gametes_2, Fd_f_gametes_3, Fd_m_gametes_1, Fd...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Fd_counts <-
  readr::read_csv(file = "data/Fd_OG_counts.csv")
## Rows: 5410 Columns: 35
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): OG
## dbl (34): Fd_f_gametes_1, Fd_f_gametes_2, Fd_f_gametes_3, Fd_m_gametes_1, Fd...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Ec_abundance <-
  readr::read_csv(file = "data/Ec_OG_abundance.csv")
## Rows: 7059 Columns: 58
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): OG
## dbl (57): M_meiospore_low_1, M_meiospore_low_2, M_meiospore_low_3, F_meiospo...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Ec_counts <-
  readr::read_csv(file = "data/Ec_OG_counts.csv")
## Rows: 7059 Columns: 58
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): OG
## dbl (57): M_meiospore_low_1, M_meiospore_low_2, M_meiospore_low_3, F_meiospo...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

PCA using OGs

selectGenes <- function(counts, min.count=2){
  keep <-
    counts[rowMeans(counts)>min.count, ]
  return(keep)
}

To minimise differences between species, I remove OGs with low counts.

combined_metatable <- 
  rbind(sample_info_fs_embryo, sample_info_fd_embryo) %>%
  relocate(LibName)
merged_TPM_pre <- 
  merge(Fs_abundance, Fd_abundance, by = "OG") %>%
  dplyr::select(1, combined_metatable$LibName)
merged_TPM <- 
  data.frame(row.names = merged_TPM_pre[,1], merged_TPM_pre[,-1], check.names = FALSE)
merged_TPM.filt <-
  as.matrix(merged_TPM) %>%
  selectGenes(min.count=10)

Through the filtering function, I retain 2889 / 3724 OGs

rlog_TPM <- DESeq2::rlog(merged_TPM.filt %>% round(0))
## rlog() may take a few minutes with 30 or more samples,
## vst() is a much faster transformation
## converting counts to integer mode
## -- note: fitType='parametric', but the dispersion trend was not well captured by the
##    function: y = a/x + b, and a local regression fit was automatically substituted.
##    specify fitType='local' or 'mean' to avoid this message next time.
log2_TPM <- log2(merged_TPM.filt+1)
sqrt_TPM <- sqrt(merged_TPM.filt)
tpm10_TPM <- (merged_TPM.filt > 10) * 1
combined_metatable$Stage <- factor(combined_metatable$Stage, levels = unique(combined_metatable$Stage))

myPCAfunction <- function(pcDat, data, x, y){
  OUT <- 
    ggplot2::autoplot(
      pcDat,
      data = data,
      frame.colour = "Stage",
      colour = "Stage",
      shape = "Species",
      x = x,
      y = y,
      size = 3,
      frame = TRUE,
      alpha = 0.5)
  return(OUT)
}

pcDat <- prcomp(t(rlog_TPM), scale = TRUE)
# pcDat <- prcomp(t(rlog_TPM))

PC1_2 <- 
  myPCAfunction(pcDat, data = combined_metatable, x = 1, y = 2) +
  labs(title = "PC1 and PC2") +
  theme_grey()

PC1_3 <- 
  myPCAfunction(pcDat, data = combined_metatable, x = 1, y = 3) +
  labs(title = "PC1 and PC3") +
  theme_grey()

rlog_plot <- ggpubr::ggarrange(PC1_2, PC1_3, common.legend = T, ncol = 1, nrow = 2)
ggpubr::annotate_figure(rlog_plot, top = ggpubr::text_grob("rlog", 
                                                   face = "bold", size = 14))

pcDat <- prcomp(t(log2_TPM), scale = TRUE)

PC1_2 <- myPCAfunction(pcDat, data = combined_metatable, x = 1, y = 2) +
  labs(title = "PC1 and PC2") +
  theme_grey()

PC1_3 <- myPCAfunction(pcDat, data = combined_metatable, x = 1, y = 3) +
  labs(title = "PC1 and PC3") +
  theme_grey()

log2_plot <- ggpubr::ggarrange(PC1_2, PC1_3, common.legend = T)
ggpubr::annotate_figure(log2_plot, top = ggpubr::text_grob("log2", 
                                                   face = "bold", size = 14))

pcDat <- prcomp(t(sqrt_TPM), scale = TRUE)
# pcDat <- prcomp(t(rlog_TPM))

PC1_2 <- myPCAfunction(pcDat, data = combined_metatable, x = 1, y = 2) +
  labs(title = "PC1 and PC2") +
  theme_grey()

PC1_3 <- myPCAfunction(pcDat, data = combined_metatable, x = 1, y = 3) +
  labs(title = "PC1 and PC3") +
  theme_grey()

sqrt_plot <- ggpubr::ggarrange(PC1_2, PC1_3, common.legend = T)
ggpubr::annotate_figure(sqrt_plot, top = ggpubr::text_grob("sqrt", 
                                                   face = "bold", size = 14))

pcDat <- prcomp(t(tpm10_TPM)) # too many zeros to do a scale = TRUE

PC1_2 <- myPCAfunction(pcDat, data = combined_metatable, x = 1, y = 2) +
  labs(title = "PC1 and PC2") +
  theme_grey()

PC1_3 <- myPCAfunction(pcDat, data = combined_metatable, x = 1, y = 3) +
  labs(title = "PC1 and PC3") +
  theme_grey()

rlog_plot <- ggpubr::ggarrange(PC1_2, PC1_3, common.legend = T)
ggpubr::annotate_figure(rlog_plot, top = ggpubr::text_grob("tpm10 threshold", 
                                                   face = "bold", size = 14))

Distance metrics

Here, we use distance metrics to quantify how distant the stages are from each other across Fucus species and maybe even in Ectocarpus.

library(philentropy)
ncol_Fs <- grep( "Fs" , colnames( log2_TPM ) )
ncol_Fd <- grep( "Fd" , colnames( log2_TPM ) )

Using log2(TPM+1)

log2_TPM_renamed <- log2_TPM
base::colnames(log2_TPM_renamed) <- combined_metatable$LibLab

Pearson correlation

Linear relationship

M = cor(x = log2_TPM_renamed[,ncol_Fs], 
        y = log2_TPM_renamed[,ncol_Fd], 
        method = "pearson")

M_melt <- reshape2::melt(M)
ggplot2::ggplot(M_melt, aes(x = Var2, y = Var1, fill = value)) +
  ggplot2::geom_tile() +
  ggplot2::theme_minimal() +
  ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
  ggplot2::labs(
    x = "Fucus distichus samples",
    y = "Fucus serratus samples",
    title = "Pearson correlation (log2)",
    fill = "Correlation") +
  ggplot2::coord_flip()

tmp1 <- sample_info_fd_embryo %>% 
  select(Var2 = LibLab, Stage) %>% 
  full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var2)`
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2 %>% group_by(Fs, Fd) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Fd, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#DC0000FF") +
  labs(x = "Fucus serratus samples",
       y = "Fucus distichus samples",
       title = "Pearson correlation (log2)",
       fill = "median(correlation)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

Spearman correlation

Monotonic relationship

log2_TPM_renamed <- log2_TPM
base::colnames(log2_TPM_renamed) <- combined_metatable$LibLab

M = cor(x = log2_TPM_renamed[,ncol_Fs], 
        y = log2_TPM_renamed[,ncol_Fd], 
        method = "spearman")

M_melt <- reshape2::melt(M)
ggplot2::ggplot(M_melt, aes(x = Var2, y = Var1, fill = value)) +
  ggplot2::geom_tile() +
  ggplot2::theme_minimal() +
  ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
  ggplot2::labs(
    x = "Fucus distichus samples",
    y = "Fucus serratus samples",
    title = "Spearman correlation (log2TPM)",
    fill = "Correlation") +
  ggplot2::coord_flip()

tmp1 <- sample_info_fd_embryo %>% 
  select(Var2 = LibLab, Stage) %>% 
  full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var2)`
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2 %>% group_by(Fs, Fd) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Fd, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#DC0000FF") +
  labs(x = "Fucus serratus samples",
       y = "Fucus distichus samples",
       title = "Spearman correlation (log2)",
       fill = "median(correlation)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

Manhattan distance

Manhattan distance is a distance metric used to measure the distance between two points in a grid-like system, such as a city block or a chessboard. It is calculated as the sum of the absolute differences between the coordinates of the two points. The term “Manhattan” is used because the grid-like structure and the right-angled turns of the streets in Manhattan, New York, resemble a chessboard. The Manhattan distance is also known as the L1 norm or taxicab distance. It is commonly used in machine learning and data science to measure similarity between data points or to cluster data.

It is L_{1} norm unlike Euclidean L_{2} norm.

Because it uses absolute values instead of exponentiation and rooting, it is said to be more robust to outliers compared to the euclidean distance. Additionally, the modulus is much faster to compute than exponentiation. It is a special case of the Minkowski distance.

DM <- philentropy::distance(t(log2_TPM_renamed), use.row.names = TRUE, method = "manhattan")
## Metric: 'manhattan'; comparing: 34 vectors.
DM2 <- DM[ncol_Fs, ncol_Fd]

M_melt <- reshape2::melt(DM2)
ggplot2::ggplot(M_melt, aes(x = Var2, y = Var1, fill = value)) +
  ggplot2::geom_tile() +
  ggplot2::theme_minimal() +
  ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  ggplot2::scale_fill_gradient(low = "white", high = "#3C5488FF") +
  ggplot2::labs(
    x = "Fucus distichus samples",
    y = "Fucus serratus samples",
    title = "Manhattan distance (log2)",
    fill = "Distance") +
  ggplot2::coord_flip()

tmp1 <- sample_info_fd_embryo %>% 
  select(Var2 = LibLab, Stage) %>% 
  full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var2)`
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2 %>% group_by(Fs, Fd) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Fd, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,0))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus serratus samples",
       y = "Fucus distichus samples",
       title = "Manhattan distance (log2)",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

Euclidean distance

Euclidean distance is a measure of the straight-line distance between two points in a multi-dimensional space. It is calculated as the square root of the sum of the squared differences between the corresponding coordinates of the two points. The Euclidean distance is commonly used in clustering and classification algorithms, as well as in distance-based data analysis and visualization techniques.

Euclidean is not ideal for high-dimension data. But it is commonly used.

DM <- philentropy::distance(t(log2_TPM_renamed), use.row.names = TRUE, method = "euclidean")
## Metric: 'euclidean'; comparing: 34 vectors.
DM2 <- DM[ncol_Fs, ncol_Fd]

M_melt <- reshape2::melt(DM2)
ggplot2::ggplot(M_melt, aes(x = Var2, y = Var1, fill = value)) +
  ggplot2::geom_tile() +
  ggplot2::theme_minimal() +
  ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  ggplot2::scale_fill_gradient(low = "white", high = "#3C5488FF") +
  ggplot2::labs(
    x = "Fucus distichus samples",
    y = "Fucus serratus samples",
    title = "Euclidean distance (log2)",
    fill = "Distance") +
  ggplot2::coord_flip()

tmp1 <- sample_info_fd_embryo %>% 
  select(Var2 = LibLab, Stage) %>% 
  full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var2)`
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2 %>% group_by(Fs, Fd) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Fd, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,1))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus serratus samples",
       y = "Fucus distichus samples",
       title = "Euclidean distance (log2)",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

JSD metric

Jensen-Shannon distance is a statistical distance metric that measures the similarity between two probability distributions. It is often used in data analysis and machine learning for clustering, classification, and anomaly detection tasks.

mat_normalized <- apply(log2_TPM_renamed, 2, function(x) x/sum(x)) %>% as.data.frame()
# colSums(mat_normalized)
DM <- philentropy::distance(t(mat_normalized), use.row.names = TRUE, method = "jensen-shannon")
## Metric: 'jensen-shannon' using unit: 'log'; comparing: 34 vectors.
DM2 <- DM[ncol_Fs, ncol_Fd]
DM3 <- sqrt(DM2)

M_melt <- reshape2::melt(DM3)
ggplot(M_melt, aes(x = Var2, y = Var1, fill = value)) +
  geom_tile() +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus distichus samples",
       y = "Fucus serratus samples",
       title = "JSD metric (norm. log2)",
       fill = "Distance") +
  coord_flip()

tmp1 <- sample_info_fd_embryo %>% 
  select(Var2 = LibLab, Stage) %>% 
  full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var2)`
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2 %>% group_by(Fs, Fd) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Fd, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus serratus samples",
       y = "Fucus distichus samples",
       title = "JSD metric (norm. log2)",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

Using rlog(TPM)

I am using rlog data here because it reduces the distance since there is higher correlation between the low-count genes. It should be noted though that the transformation here isn’t monotonic.

rlog_TPM_renamed <- rlog_TPM
rlog_TPM_renamed <- rlog_TPM
base::colnames(rlog_TPM_renamed) <- combined_metatable$LibLab

Pearson correlation

base::colnames(rlog_TPM_renamed) <- combined_metatable$LibLab

M = cor(x = rlog_TPM_renamed[,ncol_Fs], 
        y = rlog_TPM_renamed[,ncol_Fd], 
        method = "pearson")

M_melt <- reshape2::melt(M)
ggplot2::ggplot(M_melt, aes(x = Var2, y = Var1, fill = value)) +
  ggplot2::geom_tile() +
  ggplot2::theme_minimal() +
  ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
  ggplot2::labs(
    x = "Fucus distichus samples",
    y = "Fucus serratus samples",
    title = "Pearson correlation (rlog)",
    fill = "Correlation") +
  ggplot2::coord_flip()

tmp1 <- sample_info_fd_embryo %>% 
  select(Var2 = LibLab, Stage) %>% 
  full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var2)`
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2 %>% group_by(Fs, Fd) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Fd, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#DC0000FF") +
  labs(x = "Fucus serratus samples",
       y = "Fucus distichus samples",
       title = "Pearson correlation (rlog)",
       fill = "median(correlation)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

Spearman correlation

rlog_TPM_renamed <- rlog_TPM
base::colnames(rlog_TPM_renamed) <- combined_metatable$LibLab

M = cor(x = rlog_TPM_renamed[,ncol_Fs], 
        y = rlog_TPM_renamed[,ncol_Fd], 
        method = "spearman")

M_melt <- reshape2::melt(M)
ggplot2::ggplot(M_melt, aes(x = Var2, y = Var1, fill = value)) +
  ggplot2::geom_tile() +
  ggplot2::theme_minimal() +
  ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
  ggplot2::labs(
    x = "Fucus distichus samples",
    y = "Fucus serratus samples",
    title = "Spearman correlation (rlog)",
    fill = "Correlation") +
  ggplot2::coord_flip()

tmp1 <- sample_info_fd_embryo %>% 
  select(Var2 = LibLab, Stage) %>% 
  full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var2)`
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2 %>% group_by(Fs, Fd) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Fd, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#DC0000FF") +
  labs(x = "Fucus serratus samples",
       y = "Fucus distichus samples",
       title = "Spearman correlation (rlog)",
       fill = "median(correlation)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

Manhattan distance

DM <- philentropy::distance(t(rlog_TPM_renamed), use.row.names = TRUE, method = "manhattan")
## Metric: 'manhattan'; comparing: 34 vectors.
DM2 <- DM[ncol_Fs, ncol_Fd]

M_melt <- reshape2::melt(DM2)
ggplot2::ggplot(M_melt, aes(x = Var2, y = Var1, fill = value)) +
  ggplot2::geom_tile() +
  ggplot2::theme_minimal() +
  ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  ggplot2::scale_fill_gradient(low = "white", high = "#3C5488FF") +
  ggplot2::labs(
    x = "Fucus distichus samples",
    y = "Fucus serratus samples",
    title = "Manhattan distance (rlog)",
    fill = "Distance") +
  ggplot2::coord_flip()

tmp1 <- sample_info_fd_embryo %>% 
  select(Var2 = LibLab, Stage) %>% 
  full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var2)`
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2 %>% group_by(Fs, Fd) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Fd, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,0))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus serratus samples",
       y = "Fucus distichus samples",
       title = "Manhattan distance (rlog)",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

Euclidean distance

DM <- philentropy::distance(t(rlog_TPM_renamed), use.row.names = TRUE, method = "euclidean")
## Metric: 'euclidean'; comparing: 34 vectors.
DM2 <- DM[ncol_Fs, ncol_Fd]

M_melt <- reshape2::melt(DM2)
ggplot2::ggplot(M_melt, aes(x = Var2, y = Var1, fill = value)) +
  ggplot2::geom_tile() +
  ggplot2::theme_minimal() +
  ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  ggplot2::scale_fill_gradient(low = "white", high = "#3C5488FF") +
  ggplot2::labs(
    x = "Fucus distichus samples",
    y = "Fucus serratus samples",
    title = "Euclidean distance (rlog)",
    fill = "Distance") +
  ggplot2::coord_flip()

tmp1 <- sample_info_fd_embryo %>% 
  select(Var2 = LibLab, Stage) %>% 
  full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var2)`
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2 %>% group_by(Fs, Fd) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Fd, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,1))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus serratus samples",
       y = "Fucus distichus samples",
       title = "Euclidean distance (rlog)",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

JSD metric

mat_normalized <- apply(rlog_TPM_renamed, 2, function(x) x/sum(x)) %>% as.data.frame()
# colSums(mat_normalized)
DM <- philentropy::distance(t(mat_normalized), use.row.names = TRUE, method = "jensen-shannon")
## Metric: 'jensen-shannon' using unit: 'log'; comparing: 34 vectors.
DM2 <- DM[ncol_Fs, ncol_Fd]
DM3 <- sqrt(DM2)

M_melt <- reshape2::melt(DM3)
ggplot(M_melt, aes(x = Var2, y = Var1, fill = value)) +
  geom_tile() +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus distichus samples",
       y = "Fucus serratus samples",
       title = "JSD metric (norm. rlog)",
       fill = "Distance") +
  coord_flip()

tmp1 <- sample_info_fd_embryo %>% 
  select(Var2 = LibLab, Stage) %>% 
  full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var2)`
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2 %>% group_by(Fs, Fd) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Fd, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus serratus samples",
       y = "Fucus distichus samples",
       title = "JSD metric (norm. rlog)",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

Compared to Ectocarpus?

Here we use JSD metric.

sample_info_ec_proc <-
  sample_info_ec %>%
  dplyr::relocate(LibName) %>%
  dplyr::mutate(LibLab = str_c(Species, "_", Sex, "_", Stage, "_", Replicate))

duplicated(sample_info_ec_proc$LibLab)
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [25]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
combined_metatable_inclEc <- 
  rbind(combined_metatable, sample_info_ec_proc)
merged_TPM_inclEc_pre <- 
  Ec_abundance %>%
  dplyr::select(1, sample_info_ec_proc$SampleName)

merged_TPM_inclEc_pre <-
  merged_TPM_pre %>%
  merge(merged_TPM_inclEc_pre)
merged_TPM_inclEc <- 
  data.frame(row.names = merged_TPM_inclEc_pre[,1], merged_TPM_inclEc_pre[,-1], check.names = FALSE)

merged_TPM.filt <-
  as.matrix(merged_TPM_inclEc) %>%
  selectGenes(min.count=10)

We keep 1620 OGs out of 1979 OGs.

log2_TPM <- log2(merged_TPM.filt + 1)
log2_TPM_renamed <- log2_TPM
base::colnames(log2_TPM_renamed) <- combined_metatable_inclEc$LibLab
ncol_Fs <- grep( "Fs" , colnames( log2_TPM_renamed ) )
ncol_Fd <- grep( "Fd" , colnames( log2_TPM_renamed ) )
ncol_Ec <- grep( "Ec_" , colnames( log2_TPM_renamed ) )

Pearson correlation

Fs & Ec

M = cor(x = log2_TPM_renamed[,ncol_Fs], 
        y = log2_TPM_renamed[,ncol_Ec], 
        method = "pearson")

M_melt <- reshape2::melt(M)
ggplot2::ggplot(M_melt, aes(x = Var2, y = Var1, fill = value)) +
  ggplot2::geom_tile() +
  ggplot2::theme_minimal() +
  ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
  ggplot2::labs(
    x = "Ectocarpus samples",
    y = "Fucus serratus samples",
    title = "Pearson correlation (log2)",
    fill = "Correlation") +
  ggplot2::coord_flip()

tmp1 <- sample_info_ec_proc %>% 
  dplyr::select(Var2 = LibLab, Stage, Sex) %>% 
  dplyr::mutate(StageSex = str_c(Stage, "_", Sex)) %>%
  dplyr::full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Ec = StageSex) %>%
  dplyr::select(-Stage, -Sex)
## Joining with `by = join_by(Var2)`
## Warning in dplyr::full_join(., M_melt, multiple = "all"): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 397 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Ec <- factor(tmp2$Ec, levels = unique(tmp2$Ec))
tmp2 %>% group_by(Fs, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#DC0000FF") +
  labs(x = "Fucus serratus samples",
       y = "Ectocarpus samples",
       title = "Pearson correlation (log2)",
       fill = "median(correlation)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

tmp2 %>% dplyr::filter(!stringr::str_detect(Ec, 'spore|gamete')) %>% group_by(Fs, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#DC0000FF") +
  labs(x = "Fucus serratus samples",
       y = "Ectocarpus samples",
       title = "Pearson correlation (log2) [only multicellular]",
       fill = "median(correlation)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

Fd & Ec

M = cor(x = log2_TPM_renamed[,ncol_Fd], 
        y = log2_TPM_renamed[,ncol_Ec], 
        method = "pearson")

M_melt <- reshape2::melt(M)

ggplot2::ggplot(M_melt, aes(x = Var2, y = Var1, fill = value)) +
  ggplot2::geom_tile() +
  ggplot2::theme_minimal() +
  ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
  ggplot2::labs(
    x = "Ectocarpus samples",
    y = "Fucus distichus samples",
    title = "Pearson correlation (log2)",
    fill = "Correlation") +
  ggplot2::coord_flip()

tmp1 <- sample_info_ec_proc %>% 
  dplyr::select(Var2 = LibLab, Stage, Sex) %>% 
  dplyr::mutate(StageSex = str_c(Stage, "_", Sex)) %>%
  dplyr::full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Ec = StageSex) %>%
  dplyr::select(-Stage, -Sex)
## Joining with `by = join_by(Var2)`
## Warning in dplyr::full_join(., M_melt, multiple = "all"): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 353 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
tmp2 <- sample_info_fd_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2$Ec <- factor(tmp2$Ec, levels = unique(tmp2$Ec))
tmp2 %>% group_by(Fd, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fd, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#DC0000FF") +
  labs(x = "Fucus distichus samples",
       y = "Ectocarpus samples",
       title = "Pearson correlation (log2)",
       fill = "median(correlation)")
## `summarise()` has grouped output by 'Fd'. You can override using the `.groups`
## argument.

tmp2 %>% dplyr::filter(!stringr::str_detect(Ec, 'spore|gamete'))  %>% group_by(Fd, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fd, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#DC0000FF") +
  labs(x = "Fucus distichus samples",
       y = "Ectocarpus samples",
       title = "Pearson correlation (log2) [only multicellular]",
       fill = "median(correlation)")
## `summarise()` has grouped output by 'Fd'. You can override using the `.groups`
## argument.

Spearman correlation

Fs & Ec

M = cor(x = log2_TPM_renamed[,ncol_Fs], 
        y = log2_TPM_renamed[,ncol_Ec], 
        method = "spearman")

M_melt <- reshape2::melt(M)

tmp1 <- sample_info_ec_proc %>% 
  dplyr::select(Var2 = LibLab, Stage, Sex) %>% 
  dplyr::mutate(StageSex = str_c(Stage, "_", Sex)) %>%
  dplyr::full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Ec = StageSex) %>%
  dplyr::select(-Stage, -Sex)
## Joining with `by = join_by(Var2)`
## Warning in dplyr::full_join(., M_melt, multiple = "all"): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 397 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Ec <- factor(tmp2$Ec, levels = unique(tmp2$Ec))
tmp2 %>% group_by(Fs, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#DC0000FF") +
  labs(x = "Fucus serratus samples",
       y = "Ectocarpus samples",
       title = "Spearman correlation (log2)",
       fill = "median(correlation)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

tmp2 %>% dplyr::filter(!stringr::str_detect(Ec, 'spore|gamete')) %>% group_by(Fs, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#DC0000FF") +
  labs(x = "Fucus serratus samples",
       y = "Ectocarpus samples",
       title = "Spearman correlation (log2) [only multicellular]",
       fill = "median(correlation)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

Fd & Ec

M = cor(x = log2_TPM_renamed[,ncol_Fd], 
        y = log2_TPM_renamed[,ncol_Ec], 
        method = "spearman")

M_melt <- reshape2::melt(M)

tmp1 <- sample_info_ec_proc %>% 
  dplyr::select(Var2 = LibLab, Stage, Sex) %>% 
  dplyr::mutate(StageSex = str_c(Stage, "_", Sex)) %>%
  dplyr::full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Ec = StageSex) %>%
  dplyr::select(-Stage, -Sex)
## Joining with `by = join_by(Var2)`
## Warning in dplyr::full_join(., M_melt, multiple = "all"): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 353 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
tmp2 <- sample_info_fd_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2$Ec <- factor(tmp2$Ec, levels = unique(tmp2$Ec))
tmp2 %>% group_by(Fd, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fd, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#DC0000FF") +
  labs(x = "Fucus distichus samples",
       y = "Ectocarpus samples",
       title = "Spearman correlation (log2)",
       fill = "median(correlation)")
## `summarise()` has grouped output by 'Fd'. You can override using the `.groups`
## argument.

tmp2 %>% dplyr::filter(!stringr::str_detect(Ec, 'spore|gamete')) %>% group_by(Fd, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fd, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#DC0000FF") +
  labs(x = "Fucus distichus samples",
       y = "Ectocarpus samples",
       title = "Spearman correlation (log2) [only multicellular]",
       fill = "median(correlation)")
## `summarise()` has grouped output by 'Fd'. You can override using the `.groups`
## argument.

JSD metric

Fs & Ec

mat_normalized <- apply(log2_TPM_renamed, 2, function(x) x/sum(x)) %>% as.data.frame()
# colSums(mat_normalized)
DM <- philentropy::distance(t(mat_normalized), use.row.names = TRUE, method = "jensen-shannon")
## Metric: 'jensen-shannon' using unit: 'log'; comparing: 91 vectors.
DM2 <- DM[ncol_Fs, ncol_Ec]
DM3 <- sqrt(DM2)

M_melt <- reshape2::melt(DM3)

tmp1 <- sample_info_ec_proc %>% 
  dplyr::select(Var2 = LibLab, Stage, Sex) %>% 
  dplyr::mutate(StageSex = str_c(Stage, "_", Sex)) %>%
  dplyr::full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Ec = StageSex) %>%
  dplyr::select(-Stage, -Sex)
## Joining with `by = join_by(Var2)`
## Warning in dplyr::full_join(., M_melt, multiple = "all"): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 397 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Ec <- factor(tmp2$Ec, levels = unique(tmp2$Ec))
tmp2 %>% group_by(Fs, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus serratus samples",
       y = "Ectocarpus samples",
       title = "JSD metric (norm. log2)",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

tmp2 %>% dplyr::filter(!stringr::str_detect(Ec, 'spore|gamete')) %>% group_by(Fs, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus serratus samples",
       y = "Ectocarpus samples",
       title = "JSD metric (norm. log2) [only multicellular]",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

Fd & Ec

DM2 <- DM[ncol_Fd, ncol_Ec]
DM3 <- sqrt(DM2)

M_melt <- reshape2::melt(DM3)

tmp1 <- sample_info_ec_proc %>% 
  dplyr::select(Var2 = LibLab, Stage, Sex) %>% 
  dplyr::mutate(StageSex = str_c(Stage, "_", Sex)) %>%
  dplyr::full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Ec = StageSex) %>%
  dplyr::select(-Stage, -Sex)
## Joining with `by = join_by(Var2)`
## Warning in dplyr::full_join(., M_melt, multiple = "all"): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 353 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
tmp2 <- sample_info_fd_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2$Ec <- factor(tmp2$Ec, levels = unique(tmp2$Ec))
tmp2 %>% group_by(Fd, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fd, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus distichus samples",
       y = "Ectocarpus samples",
       title = "JSD metric (norm. log2)",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fd'. You can override using the `.groups`
## argument.

tmp2 %>% dplyr::filter(!stringr::str_detect(Ec, 'spore|gamete')) %>% group_by(Fd, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fd, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,3))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus distichus samples",
       y = "Ectocarpus samples",
       title = "JSD metric (norm. log2) [only multicellular]",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fd'. You can override using the `.groups`
## argument.

Manhattan distance

Fs & Ec

DM <- philentropy::distance(t(log2_TPM_renamed), use.row.names = TRUE, method = "manhattan")
## Metric: 'manhattan'; comparing: 91 vectors.
DM2 <- DM[ncol_Fs, ncol_Ec]

M_melt <- reshape2::melt(DM2)

tmp1 <- sample_info_ec_proc %>% 
  dplyr::select(Var2 = LibLab, Stage, Sex) %>% 
  dplyr::mutate(StageSex = str_c(Stage, "_", Sex)) %>%
  dplyr::full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Ec = StageSex) %>%
  dplyr::select(-Stage, -Sex)
## Joining with `by = join_by(Var2)`
## Warning in dplyr::full_join(., M_melt, multiple = "all"): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 397 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
tmp2 <- sample_info_fs_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fs = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fs <- factor(tmp2$Fs, levels = unique(tmp2$Fs))
tmp2$Ec <- factor(tmp2$Ec, levels = unique(tmp2$Ec))
tmp2 %>% group_by(Fs, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,0))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus serratus samples",
       y = "Ectocarpus samples",
       title = "Manhattan distance (log2)",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

tmp2 %>% dplyr::filter(!stringr::str_detect(Ec, 'spore|gamete')) %>% group_by(Fs, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fs, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,0))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus serratus samples",
       y = "Ectocarpus samples",
       title = "Manhattan distance (log2) [only multicellular]",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fs'. You can override using the `.groups`
## argument.

Fd & Ec

DM2 <- DM[ncol_Fd, ncol_Ec]

M_melt <- reshape2::melt(DM2)

tmp1 <- sample_info_ec_proc %>% 
  dplyr::select(Var2 = LibLab, Stage, Sex) %>% 
  dplyr::mutate(StageSex = str_c(Stage, "_", Sex)) %>%
  dplyr::full_join(M_melt, multiple = "all") %>%
  dplyr::rename(Ec = StageSex) %>%
  dplyr::select(-Stage, -Sex)
## Joining with `by = join_by(Var2)`
## Warning in dplyr::full_join(., M_melt, multiple = "all"): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 353 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
tmp2 <- sample_info_fd_embryo %>% 
  select(Var1 = LibLab, Stage) %>% 
  full_join(tmp1, multiple = "all") %>%
  dplyr::rename(Fd = Stage)
## Joining with `by = join_by(Var1)`
tmp2$Fd <- factor(tmp2$Fd, levels = unique(tmp2$Fd))
tmp2$Ec <- factor(tmp2$Ec, levels = unique(tmp2$Ec))
tmp2 %>% group_by(Fd, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fd, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,0))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus distichus samples",
       y = "Ectocarpus samples",
       title = "Manhattan distance (log2)",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fd'. You can override using the `.groups`
## argument.

tmp2 %>% dplyr::filter(!stringr::str_detect(Ec, 'spore|gamete')) %>% group_by(Fd, Ec) %>% summarise(median_corr = median(value)) %>%
  ggplot(aes(x = Fd, y = Ec, fill = median_corr)) +
  geom_tile() +
  geom_text(aes(label = paste0(round(median_corr,0))), color = "black") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
  scale_fill_gradient(low = "white", high = "#3C5488FF") +
  labs(x = "Fucus distichus samples",
       y = "Ectocarpus samples",
       title = "Manhattan distance (log2) [only multicellular]",
       fill = "median(distance)")
## `summarise()` has grouped output by 'Fd'. You can override using the `.groups`
## argument.

Essentially, the difference between Fucus species and Ectocarpus is minimised during the stage of the lowest TAI.

Wesentlich wird der Unterschied zwischen Fucus-Arten und Ectocarpus während der Phasen niedriges TAIs verringert.

Males and female Ectocarpus compared

rlog transform

Ec_PES_32m.rlog <-
        readr::read_csv(file = "data/Ec_PES_32m.rlog.csv") %>%
        dplyr::select(-1)
## Rows: 11571 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): GeneID
## dbl (10): PS, meiospore, immGA, matGA, oldGA, gamete, earlyPSP, immPSP, matP...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Rename the columns with the prefix "M_"
new_colnames <- c("GeneID", paste0("M_", colnames(Ec_PES_32m.rlog)[-1]))
colnames(Ec_PES_32m.rlog) <- new_colnames


Ec_PES_25f.rlog <-
        readr::read_csv(file = "data/Ec_PES_25f.rlog.csv") %>%
        dplyr::select(-1)
## Rows: 11571 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): GeneID
## dbl (10): PS, meiospore, immGA, matGA, oldGA, gamete, earlyPSP, immPSP, matP...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Rename the columns with the prefix "F_"
new_colnames <- c("GeneID", paste0("F_", colnames(Ec_PES_25f.rlog)[-1]))
colnames(Ec_PES_25f.rlog) <- new_colnames

Ec.rlog <- inner_join(Ec_PES_32m.rlog, Ec_PES_25f.rlog) %>%
        tibble::column_to_rownames(var = "GeneID")
## Joining with `by = join_by(GeneID)`
ncol_Female <- grep( "F_" , colnames( Ec.rlog ) )
ncol_Male <- grep( "M_" , colnames( Ec.rlog ) )

Pearson correlation

M = cor(x = Ec.rlog[,ncol_Female],
        y = Ec.rlog[,ncol_Male],
        method = "pearson")

M_melt <- reshape2::melt(M) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        dplyr::filter(stringr::str_remove(Var2, "M_") == stringr::str_remove(Var1, "F_"))
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.
M_melt_processed <- M_melt %>% mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_processed$Stage <- factor(M_melt_processed$Stage, levels = unique(M_melt_processed$Stage))

ggplot2::ggplot(M_melt_processed, aes(x = Stage, y = 1, fill = median_corr, label = round(median_corr,2))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "",
                       axis.text.y=element_blank()) +
        ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
        ggplot2::labs(
                x = "Ectocarpus stages",
                y = "Pearson",
                fill = "Correlation")

reshape2::melt(M) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        ggplot2::ggplot(aes(x = Var1, y = Var2, fill = median_corr, label = round(median_corr,2))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "") +
        ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
        ggplot2::labs(
                x = "Female stages",
                y = "Male stages",
                title = "Pearson correlation")
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.

# save this for a final plot at the end with all measures
M_melt_sex <- 
  M_melt %>%
  dplyr::mutate(metric = "Pearson") %>%
  dplyr::mutate(denoised = FALSE) %>%
  dplyr::mutate(transformation = "rlog")

Spearman correlation

M = cor(x = Ec.rlog[,ncol_Female],
        y = Ec.rlog[,ncol_Male],
        method = "spearman")

M_melt <- reshape2::melt(M) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        dplyr::filter(stringr::str_remove(Var2, "M_") == stringr::str_remove(Var1, "F_"))
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.
M_melt_processed <- M_melt %>% mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_processed$Stage <- factor(M_melt_processed$Stage, levels = unique(M_melt_processed$Stage))

ggplot2::ggplot(M_melt_processed, aes(x = Stage, y = 1, fill = median_corr, label = round(median_corr,2))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "",
                       axis.text.y=element_blank()) +
        ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
        ggplot2::labs(
                x = "Ectocarpus stages",
                y = "Spearman",
                fill = "Correlation")

# save this for a final plot at the end with all measures
M_melt_sex <- 
  M_melt %>%
  dplyr::mutate(metric = "Spearman") %>%
  dplyr::mutate(denoised = FALSE) %>%
  dplyr::mutate(transformation = "rlog") %>%
  base::rbind(M_melt_sex)

JSD metric

Traditional Jensen-Shannon divergence (JSD) ranges between [0, 1]. Due to rlog transform creating less than 0 counts, I had to remove rows with at least one column with less than 0 counts since the normalisation step wouldn’t work. This allows us to analyse 7159 genes instead of 11571 genes.

library(matrixStats)
## 
## Attaching package: 'matrixStats'
## The following object is masked from 'package:dplyr':
## 
##     count
Ec.rlog_filt <- Ec.rlog[matrixStats::rowMins(as.matrix(Ec.rlog)) > 0, ]
mat_normalized <- apply(Ec.rlog_filt, 2, function(x) x/sum(x)) %>% as.data.frame()
DM <- philentropy::distance(t(mat_normalized), use.row.names = TRUE, method = "jensen-shannon")
## Metric: 'jensen-shannon' using unit: 'log'; comparing: 18 vectors.
DM2 <- sqrt(DM)
M_melt <- reshape2::melt(DM2) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        dplyr::filter(stringr::str_remove(Var2, "M_") == stringr::str_remove(Var1, "F_"))
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.
M_melt_processed <- M_melt %>% mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_processed$Stage <- factor(M_melt_processed$Stage, levels = unique(M_melt_processed$Stage))

ggplot2::ggplot(M_melt_processed, aes(x = Stage, y = 1, fill = median_corr, label = round(median_corr,3))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "",
                       axis.text.y=element_blank()) +
        ggplot2::scale_fill_gradient(low = "white", high = "#3C5488FF") +
        ggplot2::labs(
                x = "Ectocarpus stages",
                y = "JSD")

# save this for a final plot at the end with all measures
M_melt_sex <- 
  M_melt %>%
  dplyr::mutate(metric = "JSD") %>%
  dplyr::mutate(denoised = FALSE) %>%
  dplyr::mutate(transformation = "rlog") %>%
  base::rbind(M_melt_sex)

Manhattan

DM <- philentropy::distance(t(Ec.rlog), use.row.names = TRUE, method = "manhattan")
## Metric: 'manhattan'; comparing: 18 vectors.
M_melt <- reshape2::melt(DM) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        dplyr::filter(stringr::str_remove(Var2, "M_") == stringr::str_remove(Var1, "F_"))
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.
M_melt_processed <- M_melt %>% mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_processed$Stage <- factor(M_melt_processed$Stage, levels = unique(M_melt_processed$Stage))

ggplot2::ggplot(M_melt_processed, aes(x = Stage, y = 1, fill = median_corr, label = round(median_corr,0))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "",
                       axis.text.y=element_blank()) +
        ggplot2::scale_fill_gradient(low = "white", high = "#3C5488FF") +
        ggplot2::labs(
                x = "Ectocarpus stages",
                y = "Manhattan")

# save this for a final plot at the end with all measures
M_melt_sex <- 
  M_melt %>%
  dplyr::mutate(metric = "Manhattan") %>%
  dplyr::mutate(denoised = FALSE) %>%
  dplyr::mutate(transformation = "rlog") %>%
  base::rbind(M_melt_sex)

log2(x+1) transform

sample_info_ec_proc <-
        sample_info_ec %>%
        dplyr::relocate(LibName) %>% dplyr::filter(!Sex == "H") %>%
        dplyr::mutate(LibLab = str_c(Sex, "_", Stage))

Ec_abundance <-
        readr::read_csv(file = "data/Ec_abundance.csv") %>%
        dplyr::select(1, sample_info_ec_proc$SampleName) %>%
        tibble::column_to_rownames(var = "GeneID")
## Rows: 11582 Columns: 58
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): GeneID
## dbl (57): M_meiospore_low_1, M_meiospore_low_2, M_meiospore_low_3, F_meiospo...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
merged_TPM.filt <-
  as.matrix(Ec_abundance)
log2_TPM <- log2(merged_TPM.filt + 1)

log2_TPM_renamed <- log2_TPM
base::colnames(log2_TPM_renamed) <- sample_info_ec_proc$LibLab
ncol_Female <- grep( "F_" , colnames( log2_TPM_renamed ) )
ncol_Male <- grep( "M_" , colnames( log2_TPM_renamed ) )

Pearson correlation

M = cor(x = log2_TPM_renamed[,ncol_Female],
        y = log2_TPM_renamed[,ncol_Male],
        method = "pearson")

M_melt <- reshape2::melt(M) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        dplyr::filter(stringr::str_remove(Var2, "M_") == stringr::str_remove(Var1, "F_"))
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.
M_melt_processed <- M_melt %>% mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_processed$Stage <- factor(M_melt_processed$Stage, levels = unique(M_melt_processed$Stage))

ggplot2::ggplot(M_melt_processed, aes(x = Stage, y = 1, fill = median_corr, label = round(median_corr,2))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "",
                       axis.text.y=element_blank()) +
        ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
        ggplot2::labs(
                x = "Ectocarpus stages",
                y = "Pearson")

# save this for a final plot at the end with all measures
M_melt_sex <- 
  M_melt %>%
  dplyr::mutate(metric = "Pearson") %>%
  dplyr::mutate(denoised = FALSE) %>%
  dplyr::mutate(transformation = "log2") %>%
  base::rbind(M_melt_sex)

Spearman correlation

M = cor(x = log2_TPM_renamed[,ncol_Female],
        y = log2_TPM_renamed[,ncol_Male],
        method = "spearman")

M_melt <- reshape2::melt(M) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        dplyr::filter(stringr::str_remove(Var2, "M_") == stringr::str_remove(Var1, "F_"))
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.
M_melt_processed <- M_melt %>% mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_processed$Stage <- factor(M_melt_processed$Stage, levels = unique(M_melt_processed$Stage))

ggplot2::ggplot(M_melt_processed, aes(x = Stage, y = 1, fill = median_corr, label = round(median_corr,2))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "",
                       axis.text.y=element_blank()) +
        ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
        ggplot2::labs(
                x = "Ectocarpus stages",
                y = "Spearman")

# save this for a final plot at the end with all measures
M_melt_sex <- 
  M_melt %>%
  dplyr::mutate(metric = "Spearman") %>%
  dplyr::mutate(denoised = FALSE) %>%
  dplyr::mutate(transformation = "log2") %>%
  base::rbind(M_melt_sex)

JSD metric

mat_normalized <- apply(log2_TPM_renamed, 2, function(x) x/sum(x)) %>% as.data.frame()
DM <- philentropy::distance(t(mat_normalized), use.row.names = TRUE, method = "jensen-shannon")
## Metric: 'jensen-shannon' using unit: 'log'; comparing: 54 vectors.
DM2 <- sqrt(DM)
M_melt <- reshape2::melt(DM2) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        dplyr::filter(stringr::str_remove(Var2, "M_") == stringr::str_remove(Var1, "F_"))
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.
M_melt_processed <- M_melt %>% mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_processed$Stage <- factor(M_melt_processed$Stage, levels = unique(M_melt_processed$Stage))

ggplot2::ggplot(M_melt_processed, aes(x = Stage, y = 1, fill = median_corr, label = round(median_corr,3))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "",
                       axis.text.y=element_blank()) +
        ggplot2::scale_fill_gradient(low = "white", high = "#3C5488FF") +
        ggplot2::labs(
                x = "Ectocarpus stages",
                y = "JSD")

# save this for a final plot at the end with all measures
M_melt_sex <- 
  M_melt %>%
  dplyr::mutate(metric = "JSD") %>%
  dplyr::mutate(denoised = FALSE) %>%
  dplyr::mutate(transformation = "log2") %>%
  base::rbind(M_melt_sex)

Manhattan distance

DM <- philentropy::distance(t(log2_TPM_renamed), use.row.names = TRUE, method = "manhattan")
## Metric: 'manhattan'; comparing: 54 vectors.
M_melt <- reshape2::melt(DM) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        dplyr::filter(stringr::str_remove(Var2, "M_") == stringr::str_remove(Var1, "F_"))
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.
M_melt_processed <- M_melt %>% mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_processed$Stage <- factor(M_melt_processed$Stage, levels = unique(M_melt_processed$Stage))

ggplot2::ggplot(M_melt_processed, aes(x = Stage, y = 1, fill = median_corr, label = round(median_corr,0))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "",
                       axis.text.y=element_blank()) +
        ggplot2::scale_fill_gradient(low = "white", high = "#3C5488FF") +
        ggplot2::labs(
                x = "Ectocarpus stages",
                y = "Manhattan")

# save this for a final plot at the end with all measures
M_melt_sex <- 
  M_melt %>%
  dplyr::mutate(metric = "Manhattan") %>%
  dplyr::mutate(denoised = FALSE) %>%
  dplyr::mutate(transformation = "log2") %>%
  base::rbind(M_melt_sex)

denoised data

Ec_abundance.denoised <-
        readr::read_csv(file = "data/Ec_abundance_denoised.csv") %>%
        dplyr::select(1, sample_info_ec_proc$SampleName) %>%
        tibble::column_to_rownames(var = "GeneID")
## Rows: 2726 Columns: 58
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): GeneID
## dbl (57): M_meiospore_low_1, M_meiospore_low_2, M_meiospore_low_3, F_meiospo...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
merged_TPM.filt <-
  as.matrix(Ec_abundance.denoised)
log2_TPM <- log2(merged_TPM.filt + 1)

log2_TPM_renamed <- log2_TPM
base::colnames(log2_TPM_renamed) <- sample_info_ec_proc$LibLab
ncol_Female <- grep( "F_" , colnames( log2_TPM_renamed ) )
ncol_Male <- grep( "M_" , colnames( log2_TPM_renamed ) )

Pearson correlation

M = cor(x = log2_TPM_renamed[,ncol_Female],
        y = log2_TPM_renamed[,ncol_Male],
        method = "pearson")

reshape2::melt(M) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        ggplot2::ggplot(aes(x = Var1, y = Var2, fill = median_corr, label = round(median_corr,2))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "") +
        ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
        ggplot2::labs(
                x = "Female stages",
                y = "Male stages",
                title = "Pearson correlation")
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.

M_melt <- reshape2::melt(M) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        dplyr::filter(stringr::str_remove(Var2, "M_") == stringr::str_remove(Var1, "F_"))
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.
M_melt_processed <- M_melt %>% mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_processed$Stage <- factor(M_melt_processed$Stage, levels = unique(M_melt_processed$Stage))

ggplot2::ggplot(M_melt_processed, aes(x = Stage, y = 1, fill = median_corr, label = round(median_corr,2))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "",
                       axis.text.y=element_blank()) +
        ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
        ggplot2::labs(
                x = "Ectocarpus stages",
                y = "Pearson",
                fill = "Correlation")

# save this for a final plot at the end with all measures
M_melt_sex <- 
  M_melt %>%
  dplyr::mutate(metric = "Pearson") %>%
  dplyr::mutate(denoised = TRUE) %>%
  dplyr::mutate(transformation = "log2") %>%
  base::rbind(M_melt_sex)

Spearman correlation

M = cor(x = log2_TPM_renamed[,ncol_Female],
        y = log2_TPM_renamed[,ncol_Male],
        method = "spearman")

reshape2::melt(M) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        ggplot2::ggplot(aes(x = Var1, y = Var2, fill = median_corr, label = round(median_corr,2))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "") +
        ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
        ggplot2::labs(
                x = "Female stages",
                y = "Male stages",
                title = "Spearman correlation")
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.

M_melt <- reshape2::melt(M) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        dplyr::filter(stringr::str_remove(Var2, "M_") == stringr::str_remove(Var1, "F_"))
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.
M_melt_processed <- M_melt %>% mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_processed$Stage <- factor(M_melt_processed$Stage, levels = unique(M_melt_processed$Stage))

ggplot2::ggplot(M_melt_processed, aes(x = Stage, y = 1, fill = median_corr, label = round(median_corr,2))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "",
                       axis.text.y=element_blank()) +
        ggplot2::scale_fill_gradient(low = "white", high = "#DC0000FF") +
        ggplot2::labs(
                x = "Ectocarpus stages",
                y = "Spearman",
                fill = "Correlation")

# save this for a final plot at the end with all measures
M_melt_sex <- 
  M_melt %>%
  dplyr::mutate(metric = "Spearman") %>%
  dplyr::mutate(denoised = TRUE) %>%
  dplyr::mutate(transformation = "log2") %>%
  base::rbind(M_melt_sex)

JSD metric

mat_normalized <- apply(log2_TPM_renamed, 2, function(x) x/sum(x)) %>% as.data.frame()
DM <- philentropy::distance(t(mat_normalized), use.row.names = TRUE, method = "jensen-shannon")
## Metric: 'jensen-shannon' using unit: 'log'; comparing: 54 vectors.
DM2 <- sqrt(DM)
M_melt <- reshape2::melt(DM2) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        dplyr::filter(stringr::str_remove(Var2, "M_") == stringr::str_remove(Var1, "F_"))
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.
M_melt_processed <- M_melt %>% mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_processed$Stage <- factor(M_melt_processed$Stage, levels = unique(M_melt_processed$Stage))

ggplot2::ggplot(M_melt_processed, aes(x = Stage, y = 1, fill = median_corr, label = round(median_corr,3))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "",
                       axis.text.y=element_blank()) +
        ggplot2::scale_fill_gradient(low = "white", high = "#3C5488FF") +
        ggplot2::labs(
                x = "Ectocarpus stages",
                y = "JSD")

# save this for a final plot at the end with all measures
M_melt_sex <- 
  M_melt %>%
  dplyr::mutate(metric = "JSD") %>%
  dplyr::mutate(denoised = TRUE) %>%
  dplyr::mutate(transformation = "log2") %>%
  base::rbind(M_melt_sex)

Manhattan distance

DM <- philentropy::distance(t(log2_TPM_renamed), use.row.names = TRUE, method = "manhattan")
## Metric: 'manhattan'; comparing: 54 vectors.
M_melt <- reshape2::melt(DM) %>%
        dplyr::group_by(Var1, Var2) %>% 
        dplyr::summarise(median_corr = median(value)) %>%
        dplyr::filter(stringr::str_remove(Var2, "M_") == stringr::str_remove(Var1, "F_"))
## `summarise()` has grouped output by 'Var1'. You can override using the
## `.groups` argument.
M_melt_processed <- M_melt %>% mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_processed$Stage <- factor(M_melt_processed$Stage, levels = unique(M_melt_processed$Stage))

ggplot2::ggplot(M_melt_processed, aes(x = Stage, y = 1, fill = median_corr, label = round(median_corr,0))) +
        ggplot2::geom_tile() +
        ggplot2::geom_text() +
        ggplot2::theme_minimal() +
        ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
                       legend.position = "",
                       axis.text.y=element_blank()) +
        ggplot2::scale_fill_gradient(low = "white", high = "#3C5488FF") +
        ggplot2::labs(
                x = "Ectocarpus stages",
                y = "Manhattan")

# save this for a final plot at the end with all measures
M_melt_sex <- 
  M_melt %>%
  dplyr::mutate(metric = "Manhattan") %>%
  dplyr::mutate(denoised = TRUE) %>%
  dplyr::mutate(transformation = "log2") %>%
  base::rbind(M_melt_sex)
normalit<-function(m){
   (m - min(m))/(max(m)-min(m))
 }

M_melt_sex_processed <-
  M_melt_sex %>% 
  mutate(Stage = stringr::str_remove(Var2, "M_"))
M_melt_sex_processed$Stage <- factor(M_melt_sex_processed$Stage, levels = unique(M_melt_sex_processed$Stage))

metrics = c("Pearson", "Spearman")

M_melt_sex_processed_norm <-
  M_melt_sex_processed %>%
  dplyr::group_by(metric, denoised, transformation) %>%
  mutate(norm = normalit(median_corr))

M_melt_sex_processed %>%
  dplyr::filter(metric %in% metrics) %>%
  ggplot2::ggplot(aes(x = Stage, 
                      y = 1- median_corr, 
                      shape = denoised,
                      colour = transformation,
                      group = interaction(metric, denoised, transformation)
                      )) +
  ggplot2::geom_line() +
  ggplot2::geom_point(size = 3) +
  ggplot2::coord_flip() +
  ggplot2::facet_grid(. ~ metric) +
  # ggplot2::theme_classic() +
  ggplot2::labs(
          x = "Stage",
          y = "1 - correlation"
  ) +
  ggsci::scale_colour_npg() +
  ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

M_melt_sex_processed_norm %>%
  dplyr::filter(!metric %in% metrics) %>%
  ggplot2::ggplot(aes(x = Stage, 
                      y = norm, 
                      shape = denoised,
                      colour = transformation,
                      group = interaction(metric, denoised, transformation)
                      )) +
  ggplot2::geom_line() +
  ggplot2::geom_point(size = 3) +
  ggplot2::scale_y_continuous(breaks = scales::pretty_breaks(n = 2)) +
  # ggplot2::scale_y_log10() +
  # ggplot2::annotation_logticks(sides = "b") +
  ggplot2::coord_flip() +
  ggplot2::facet_grid(. ~ metric) +
  # ggplot2::theme_classic() +
  ggplot2::labs(
          x = "Stage",
          y = "Relative distance between sexes"
  ) +
  ggsci::scale_colour_npg() +
  ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.2.2 (2022-10-31)
##  os       macOS Big Sur ... 10.16
##  system   x86_64, darwin17.0
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Europe/Berlin
##  date     2024-01-10
##  pandoc   3.1.6.2 @ /usr/local/bin/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package              * version   date (UTC) lib source
##  abind                  1.4-5     2016-07-21 [1] CRAN (R 4.2.0)
##  annotate               1.76.0    2022-11-01 [1] Bioconductor
##  AnnotationDbi          1.60.2    2023-03-10 [1] Bioconductor
##  backports              1.4.1     2021-12-13 [1] CRAN (R 4.2.0)
##  Biobase                2.58.0    2022-11-01 [1] Bioconductor
##  BiocGenerics           0.44.0    2022-11-01 [1] Bioconductor
##  BiocParallel           1.32.6    2023-03-17 [1] Bioconductor
##  Biostrings             2.66.0    2022-11-01 [1] Bioconductor
##  bit                    4.0.5     2022-11-15 [1] CRAN (R 4.2.0)
##  bit64                  4.0.5     2020-08-30 [1] CRAN (R 4.2.0)
##  bitops                 1.0-7     2021-04-24 [1] CRAN (R 4.2.0)
##  blob                   1.2.4     2023-03-17 [1] CRAN (R 4.2.0)
##  broom                  1.0.5     2023-06-09 [1] CRAN (R 4.2.0)
##  bslib                  0.5.1     2023-08-11 [1] CRAN (R 4.2.0)
##  cachem                 1.0.8     2023-05-01 [1] CRAN (R 4.2.0)
##  callr                  3.7.3     2022-11-02 [1] CRAN (R 4.2.0)
##  car                    3.1-2     2023-03-30 [1] CRAN (R 4.2.0)
##  carData                3.0-5     2022-01-06 [1] CRAN (R 4.2.0)
##  cli                    3.6.1     2023-03-23 [1] CRAN (R 4.2.0)
##  codetools              0.2-19    2023-02-01 [1] CRAN (R 4.2.0)
##  colorspace             2.1-0     2023-01-23 [1] CRAN (R 4.2.0)
##  cowplot                1.1.1     2020-12-30 [1] CRAN (R 4.2.0)
##  crayon                 1.5.2     2022-09-29 [1] CRAN (R 4.2.0)
##  DBI                    1.1.3     2022-06-18 [1] CRAN (R 4.2.0)
##  DelayedArray           0.24.0    2022-11-01 [1] Bioconductor
##  DESeq2                 1.38.3    2023-01-19 [1] Bioconductor
##  devtools               2.4.5     2022-10-11 [1] CRAN (R 4.2.0)
##  digest                 0.6.33    2023-07-07 [1] CRAN (R 4.2.0)
##  dplyr                * 1.1.3     2023-09-03 [1] CRAN (R 4.2.0)
##  ellipsis               0.3.2     2021-04-29 [1] CRAN (R 4.2.0)
##  evaluate               0.22      2023-09-29 [1] CRAN (R 4.2.2)
##  fansi                  1.0.5     2023-10-08 [1] CRAN (R 4.2.2)
##  farver                 2.1.1     2022-07-06 [1] CRAN (R 4.2.0)
##  fastmap                1.1.1     2023-02-24 [1] CRAN (R 4.2.0)
##  forcats              * 1.0.0     2023-01-29 [1] CRAN (R 4.2.0)
##  fs                     1.6.3     2023-07-20 [1] CRAN (R 4.2.0)
##  geneplotter            1.76.0    2022-11-01 [1] Bioconductor
##  generics               0.1.3     2022-07-05 [1] CRAN (R 4.2.0)
##  GenomeInfoDb           1.34.9    2023-02-02 [1] Bioconductor
##  GenomeInfoDbData       1.2.9     2023-05-04 [1] Bioconductor
##  GenomicRanges          1.50.2    2022-12-16 [1] Bioconductor
##  ggfortify            * 0.4.16    2023-03-20 [1] CRAN (R 4.2.0)
##  ggplot2              * 3.4.4     2023-10-12 [1] CRAN (R 4.2.2)
##  ggpubr                 0.6.0     2023-02-10 [1] CRAN (R 4.2.0)
##  ggsci                  3.0.0     2023-03-08 [1] CRAN (R 4.2.0)
##  ggsignif               0.6.4     2022-10-13 [1] CRAN (R 4.2.0)
##  glue                   1.6.2     2022-02-24 [1] CRAN (R 4.2.0)
##  gridExtra              2.3       2017-09-09 [1] CRAN (R 4.2.0)
##  gtable                 0.3.4     2023-08-21 [1] CRAN (R 4.2.0)
##  hms                    1.1.3     2023-03-21 [1] CRAN (R 4.2.0)
##  htmltools              0.5.6.1   2023-10-06 [1] CRAN (R 4.2.2)
##  htmlwidgets            1.6.2     2023-03-17 [1] CRAN (R 4.2.0)
##  httpuv                 1.6.11    2023-05-11 [1] CRAN (R 4.2.2)
##  httr                   1.4.7     2023-08-15 [1] CRAN (R 4.2.0)
##  IRanges                2.32.0    2022-11-01 [1] Bioconductor
##  jquerylib              0.1.4     2021-04-26 [1] CRAN (R 4.2.0)
##  jsonlite               1.8.7     2023-06-29 [1] CRAN (R 4.2.0)
##  KEGGREST               1.38.0    2022-11-01 [1] Bioconductor
##  knitr                  1.44      2023-09-11 [1] CRAN (R 4.2.0)
##  labeling               0.4.3     2023-08-29 [1] CRAN (R 4.2.0)
##  later                  1.3.1     2023-05-02 [1] CRAN (R 4.2.2)
##  lattice                0.21-9    2023-10-01 [1] CRAN (R 4.2.2)
##  lifecycle              1.0.3     2022-10-07 [1] CRAN (R 4.2.0)
##  locfit                 1.5-9.8   2023-06-11 [1] CRAN (R 4.2.0)
##  lubridate            * 1.9.3     2023-09-27 [1] CRAN (R 4.2.0)
##  magrittr               2.0.3     2022-03-30 [1] CRAN (R 4.2.0)
##  Matrix                 1.5-4.1   2023-05-18 [1] CRAN (R 4.2.0)
##  MatrixGenerics         1.10.0    2022-11-01 [1] Bioconductor
##  matrixStats          * 1.0.0     2023-06-02 [1] CRAN (R 4.2.0)
##  memoise                2.0.1     2021-11-26 [1] CRAN (R 4.2.0)
##  mime                   0.12      2021-09-28 [1] CRAN (R 4.2.0)
##  miniUI                 0.1.1.1   2018-05-18 [1] CRAN (R 4.2.0)
##  munsell                0.5.0     2018-06-12 [1] CRAN (R 4.2.0)
##  philentropy          * 0.7.0     2022-11-05 [1] CRAN (R 4.2.0)
##  pillar                 1.9.0     2023-03-22 [1] CRAN (R 4.2.0)
##  pkgbuild               1.4.2     2023-06-26 [1] CRAN (R 4.2.0)
##  pkgconfig              2.0.3     2019-09-22 [1] CRAN (R 4.2.0)
##  pkgload                1.3.3     2023-09-22 [1] CRAN (R 4.2.0)
##  plyr                   1.8.9     2023-10-02 [1] CRAN (R 4.2.2)
##  png                    0.1-8     2022-11-29 [1] CRAN (R 4.2.0)
##  prettyunits            1.2.0     2023-09-24 [1] CRAN (R 4.2.0)
##  processx               3.8.2     2023-06-30 [1] CRAN (R 4.2.0)
##  profvis                0.3.8     2023-05-02 [1] CRAN (R 4.2.0)
##  promises               1.2.1     2023-08-10 [1] CRAN (R 4.2.2)
##  ps                     1.7.5     2023-04-18 [1] CRAN (R 4.2.0)
##  purrr                * 1.0.2     2023-08-10 [1] CRAN (R 4.2.2)
##  R6                     2.5.1     2021-08-19 [1] CRAN (R 4.2.0)
##  RColorBrewer           1.1-3     2022-04-03 [1] CRAN (R 4.2.0)
##  Rcpp                   1.0.11    2023-07-06 [1] CRAN (R 4.2.0)
##  RCurl                  1.98-1.12 2023-03-27 [1] CRAN (R 4.2.0)
##  readr                * 2.1.4     2023-02-10 [1] CRAN (R 4.2.0)
##  remotes                2.4.2.1   2023-07-18 [1] CRAN (R 4.2.2)
##  reshape2               1.4.4     2020-04-09 [1] CRAN (R 4.2.0)
##  rlang                  1.1.1     2023-04-28 [1] CRAN (R 4.2.0)
##  rmarkdown              2.25      2023-09-18 [1] CRAN (R 4.2.2)
##  RSQLite                2.3.1     2023-04-03 [1] CRAN (R 4.2.0)
##  rstatix                0.7.2     2023-02-01 [1] CRAN (R 4.2.0)
##  rstudioapi             0.15.0    2023-07-07 [1] CRAN (R 4.2.0)
##  S4Vectors              0.36.2    2023-02-26 [1] Bioconductor
##  sass                   0.4.7     2023-07-15 [1] CRAN (R 4.2.0)
##  scales                 1.2.1     2022-08-20 [1] CRAN (R 4.2.0)
##  sessioninfo            1.2.2     2021-12-06 [1] CRAN (R 4.2.0)
##  shiny                  1.7.5.1   2023-10-14 [1] CRAN (R 4.2.2)
##  stringi                1.7.12    2023-01-11 [1] CRAN (R 4.2.0)
##  stringr              * 1.5.0     2022-12-02 [1] CRAN (R 4.2.0)
##  SummarizedExperiment   1.28.0    2022-11-01 [1] Bioconductor
##  tibble               * 3.2.1     2023-03-20 [1] CRAN (R 4.2.0)
##  tidyr                * 1.3.0     2023-01-24 [1] CRAN (R 4.2.0)
##  tidyselect             1.2.0     2022-10-10 [1] CRAN (R 4.2.0)
##  tidyverse            * 2.0.0     2023-02-22 [1] CRAN (R 4.2.0)
##  timechange             0.2.0     2023-01-11 [1] CRAN (R 4.2.0)
##  tzdb                   0.4.0     2023-05-12 [1] CRAN (R 4.2.2)
##  urlchecker             1.0.1     2021-11-30 [1] CRAN (R 4.2.0)
##  usethis                2.2.2     2023-07-06 [1] CRAN (R 4.2.0)
##  utf8                   1.2.3     2023-01-31 [1] CRAN (R 4.2.0)
##  vctrs                  0.6.4     2023-10-12 [1] CRAN (R 4.2.2)
##  vroom                  1.6.4     2023-10-02 [1] CRAN (R 4.2.2)
##  withr                  2.5.1     2023-09-26 [1] CRAN (R 4.2.0)
##  xfun                   0.40      2023-08-09 [1] CRAN (R 4.2.2)
##  XML                    3.99-0.14 2023-03-19 [1] CRAN (R 4.2.0)
##  xtable                 1.8-4     2019-04-21 [1] CRAN (R 4.2.0)
##  XVector                0.38.0    2022-11-01 [1] Bioconductor
##  yaml                   2.3.7     2023-01-23 [1] CRAN (R 4.2.0)
##  zlibbioc               1.44.0    2022-11-01 [1] Bioconductor
## 
##  [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
## 
## ──────────────────────────────────────────────────────────────────────────────